WhereNext is a recommendation system designed to optimize the process of filling gaps in biodiversity knowledge. As such, it may help in the process of planning biological surveys, prioritize sampling from a list of preselected sites and even prioritize regions to mobilize data from (e.g. by digitizing local collections). WhereNext relies on open biodiversity and environmental data, but also provides users with options to upload their own datasets. Though devised to be run using a graphic interface, WhereNext functions may be called directly from the console and integrated into custom data analysis workflows. Besides WhereNext recommendations, the intermediate products generated by the app maybe useful in other analysis, such as raw and cleaned occurrences downloaded from GBIF, cropped and masked WorldClim data, cell sampling completeness estimations and the spatial representation of community dissimilarity.
WhereNext is based on generalized dissimilarity modeling (GDM), a statistical method to model and predict spatial patterns of dissimilarity in community composition (Ferrier et al. 2007). In GDM, site-pair community dissimilarity (measured through a dissimilarity index, such as Bray-Curtis or Jaccard) is modeled as a function of the respective differences in environmental variables among sites. This modeling permits estimation of the expected dissimilarity between arbitrary pairs of sites, for example, between surveyed and unsurveyed sites.
To optimally fill gaps in biodiversity knowledge, WhereNext uses the finding by Faith & Walker (1996) that the number of species sampled by a set of sites will be maximized if on average the biological distance from any individual site to its nearest neighbor is as small as possible. Thus, each iteration of the “Recommend Survey” component of WhereNext, will find the grid cell that minimizes the biological distance to all already sampled places, in other words, the most complementary.